A Convex Formulation for Semi-Supervised Multi-Label Feature Selection
Authors
Abstract
The explosive growth of multimedia data has raised the challenge of how to efficiently browse, retrieve, and organize these data. Under these circumstances, different approaches have been proposed to facilitate multimedia analysis. Several semi-supervised feature selection algorithms have been proposed to exploit both labeled and unlabeled data. However, they are implemented based on graphs, so they cannot handle large-scale datasets. How to conduct semi-supervised feature selection on large-scale datasets has become a challenging research problem. Moreover, existing multi-label feature selection algorithms rely on eigen-decomposition, which carries a heavy computational burden and further prevents current feature selection algorithms from being applied to big data. In this paper, we propose a novel convex semi-supervised multi-label feature selection algorithm that can be applied to large-scale datasets. We evaluate the performance of the proposed algorithm on five benchmark datasets and compare the results with state-of-the-art supervised and semi-supervised feature selection algorithms, as well as with a baseline that uses all features. The experimental results demonstrate that our proposed algorithm consistently achieves superior performance.
Similar resources
READER: Robust Semi-Supervised Multi-Label Dimension Reduction
Multi-label classification is an appealing and challenging supervised learning problem, where multiple labels, rather than a single label, are associated with an unseen test instance. To remove possible noise in the labels and in high-dimensional features, multi-label dimension reduction has attracted more and more attention in recent years. The existing methods usually suffer from several pro...
MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...
Multi-Label Classification with Unlabeled Data: An Inductive Approach
The problem of multi-label classification has attracted great interest in the last decade. Multi-label classification refers to problems where an example that is represented by a single instance can be assigned to more than one category. Until now, most research on multi-label classification has focused on supervised settings whose assumption is that large amount of labeled traini...
Semi-supervised Feature Analysis for Multimedia Annotation by Mining Label Correlation
In multimedia annotation, manually labeling a large amount of training data is both time-consuming and tedious. Therefore, to automate this process, a number of methods that leverage unlabeled training data have been proposed. Normally, a given multimedia sample is associated with multiple labels, which may have inherent correlations in the real world. Classical multimedia annotation algorithms add...
Fast semi-supervised SVM classifiers using a priori metric information
This paper describes a support vector machine-based (SVM) parametric optimization method for semi-supervised classification, called LIAM (for LInear hyperplane classifier with A-priori Metric information). Our method takes advantage of similarity information to leverage the unlabeled data in training SVMs. In addition to the smoothness constraints in existing semi-supervised methods, LIAM incor...